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We describe a general strategy, PERM ( Pruned- .Enriched Posenbluth Method), for sam- 
pling configurations from a given Gibbs-Boltzmann distribution. The method is not 
based on the Metropolis concept of establishing a Markov process whose stationary 
state is the wanted distribution. Instead, it starts off building instances according to a 
biased distribution, but corrects for this by cloning "good" and killing "bad" configura- 
tions. In doing so, it uses the fact that nontrivial problems in statistical physics are high 
dimensional. Therefore, instances are built step by step, and the final "success" of an 
instance can be guessed at an early stage. Using weighted samples, this is done so that 
the final distribution is strictly unbiased. In contrast to evolutionary algorithms, the 
cloning/killing is done without simultaneously keeping a large population in computer 
memory. We apply this in large scale simulations of homopolymers near the theta and 
unmixing critical points. In addition we sketch other applications, notably to polymers in 
confined geometries and to randomly branched polymers. For theta polymers we confirm 
the very strong logarithmic corrections found in previous work. For critical unmixing 
we essentially confirm the Flory-Huggins mean field theory and the logarithmic correc- 
tions to it computed by Duplantier. We suggest that the latter are responsible for some 
apparent violations of mean field behavior. This concerns in particular the exponent for 
the chain length dependence of the critical density which is 1/2 in Flory-Huggins theory, 
but is claimed to be fa 0.38 in several experiments. 



1 Introduction 

For many statistical physicists, "Monte Carlo" is synonymous for the Metropo- 
lis strategy!! where one sets up an ergodic Markov process which has the desired 
Gibbs-Boltzmann distribution as its unique asymptotic state. There exist numer- 
ous refinements concerned with more efficient transitions in the Markov process (e.g. 
cluster flipsa or pivot movefl), or with distributions biased such that false minima 
are less harmful and that autocorrelations are reduced (e.g. multicanonicau sam- 
pling and simulated tempering?). But most of these schemes remain entirely within 
the framework of the Metropolis strategy. 

On the other hand, there exists a very extensive literature on simulations of 
polymer systems by completely different methods, starting already in the mid- 
50's and continuing until today. Polymers represent a challenge for simulations 
because of the constraints imposed by chain connectivity on the one hand, and 
steric hindrance on the other. These can lead to entanglement effects which seriously 
reduce the mobility of the monomers, and which can render Metropolis type schemes 
inefficient. 

The first such alternative method was invented by Rosenbluth and Rosenbluthi 
They observed that the exponential attrition in straightforward simulations of un- 
biased self avoiding random walks (SAW's) can be strongly reduced by simulating 
a biased sample, and correcting the bias by means of a weight associated to each 
configuration. The biased sample is simply obtained by replacing any "illegal" step 



which would violate the self avoidance constraint by a random "legal" one, provided 
such a legal step exists. More generally, assume we want to simulate a distribution 
in which each configuration C is weighted by a (Boltzmann-)weight Q(C), so that 
for any observable A one has (A) = J^c A(C)Q(C)/ J2c Q(Q- ^ we sample un- 
evenly with probability p(C), then we must compensate this by giving a weight 
<xQ(C)/p{C), 

(A) = hm ^M^Qi^/m s Um lf A(C|MC()i (1) 
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We call this the generalized Roscnbluth method. If p(C) were chosen close to 
Q(C), this would lead to importance sampling and obviously would be very efficient. 
But in general this is not possible, and Eq.(|l]) suffers from the problem that the 
sum is dominated by very few events with high weight. 

Consider now a lattice chain of length iV+1 with self avoidance and with nearest 
neighbor interaction — e between unbonded neighbors. In the original Rosenbluth 
method, p(C) is then a product, 
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p{C) oc JJ — (2) 
J - J - m. 
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where m n is the number of free neighbors in the n-th step, i.e. the number of possible 
lattice sites where to place the n-th monomer (monomers are labeled n — 0, 1, . . . N). 
Similarly, Q(C) is a product, 
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Q(C)=\[e-^, (3) 

71=1 

where (3 — 1/kT and E n = -e^ifco ^kn is the energy of the n-th monomer in the 
field of all previous ones (A/-„ = 1 if and only if monomers k and n are neighbors 
and non-bonded, otherwise Afc„ = 0). 

Obviously, p(C) favors compact configurations where monomers have only few 
free neighbors. This renders the Rosenbluth method unsuitable for long chains, 
except near the collapse ('theta') point where simulations with N < 1000 are feasible 
on the simple cubic latticeQ In general we should find ways to modify the sampling 
so that "good" configurations are sampled more frequently, and "bad" ones less. 
The key to this is the product structure of the weights 

W{Ci) = iif-i^A/ ntr \i ir\ = T~ H Wn{Ci) ( } 

M Efc=i Q(Ck)/p(Ck) z N n=1 

implied by Eqs. (||) and @. Here, Zn — M' 1 J2k=i Q(^k)/p(Ck) is an estimate of 
the partition sum. A similar product structure holds in practically all interesting 
cases. 
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We can thus watch how the weight builds up while the chain is constructed step 
by step. If the partial weight (from now on we drop the dependence on C) 

n 

Wn^Z- 1 Y[ Wj , Kn < N , (5) 

2=1 

gets too large (i.e. is above some threshold W + ), we replace the configuration by k 
copies, each with weight W n /k. One of these copies, is continued to grow, all others 
are placed on a stack for later use. Following Rem we call this 'enrichment'. The 
opposite action, when W n falls below another threshold W~ ('pruning'), is done 
stochastically: With probability 1/2 the configuration is killed and replaced by the 
top of the stack, while its weight is doubled in the other halLpf cases. 

In this way PERM (pruned-enriched Rosenbluth methodEI) gives a sample with 
exactly the right statistical weights, independently of the thresholds W^, the se- 
lection probability p(C), and the clone multiplicity k. But its efficiency depends 
strongly on good choices for these parameters. Notice that one has complete free- 
dom in choosing them, and can even change them during a run. Fortunately, rea- 
sonably good choices are easy to find (more sophisticated choices needed at very 
low temperatures are discussed in RefEElO). The guiding principle for p(C) is that 
it should lead as closely as possible to the correct final distribution, so that pruning 
and enrichment are kept to a minimum. This is also part of the guiding principles 
for W^. In addition, W + and W~ have to be chosen such that roughly the same 
effort is spent on simulating any part of the configuration. For polymers this means 
that the sample size should neither grow nor decrease with- chain length n. This is 
easily done by adjusting W + and W~ 'on the fly', see Refla where a pseudocode is 
given for the entire algorithm. 

In selecting the gopd and killing the bad, PERM is similar to evolutionary 
and pe.yie.tk -aJgorithmspj to population based growth algorithms_for chain poly- 
mers pJiI-Mo to diffusion type quantum Monte Carlo algorithmsP-3 and to the 'go 
with the winners' strategy of Reflia. The main difference with the first three groups 
of methods is that we do not keep the entire population of instances simultaneously 
in computer memory. Indeed, even on the stack we do not keep copies of good con- 
figurations but only the steps involved in constructing the configurations and flags 
telling us when to make a copyQ In genetic algorithms, keeping the entire population 
in memory is needed for cross-overs, and it allows a one-to-one competition between 
instances. But in our case this is not needed since every instance can be compared 
to the average behavior of all others. The same would be true for diffusion type 
quantum Monte Carlo simulations. The main advantage of our strategy is that it 
reduces enormously computer memory. This, together with the surprisingly easy 
determination of the thresholds VT ± , could make PERM also a very useful strategy 
for quantum Monte Carlo simulations. 

In section 2 we will apply PERM to single homopolymers on the simple cubic 
lattice near their theta collapse temperature. This will be extended to semi-dilute 
solutions in Sec. 3, where we shall discuss critical unmixing of very long polymer 
chains. Finally, we shall sketch some further applications and improvements in 
Sec. 4. This includes what we call 'markovian anticipation' and applications to 
2 — d SAW's, to polymers in confined geometries, and to branched polymers. 
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2 The 6 Collapse of Single Chains 



The collapse is a tricritical phenomenon, described by the n — > limit of the 
tricritical Landau-Ginzburg 0(n) modelEj Its upper critical dimension is, as for 
all tricritical 0(n) models, d = 3. Therefore we must expect mean-field critical 
exponents in d = 3, modified only by logarithmic corrections. These corrections 
have been studied in great detail, partly dii^to some long- lasting controversies. 
The nowadays accepted results by DuplantieiEl include among others the following 
predictions: 

(i) The average end-to-end distance for a chain of N elements grows at T = Tg 

as 

Rl, IN « const ( 1 — — ) (6) 

Nl V 3631niVy v ' 

(ii) The specific heat per monomer scales, again exactly at T = Tg, as 

c(T = To) = 1 (J-^j «to 2 ) - (m) 2 ) ~ (In AT) 3 / 11 (7) 

where to is the number of non-bonded nearest neighbor contacts. 

(iii) In the collapsed phase T < Tg, the chain forms a globule with a bulk 
monomer density cf> which scales as 

^(Te-TH-logCTe-T)] 7 / 11 . (8) 

Here we shall only discuss chains modeled as SAW's on the simple cubic lattice 
with nearest neighbor attraction. For simplicity we choose this energy as e = — 1, 
so that each non-bonded next neighbor contact contributes a factor q 

Two-dimensional chains are treated with similar algorithms io Refui'Ea, chains 
on other 3-d lattices and off-lattice chains are discussed in RefH'ElL 

The simulations presented below show most dramatically the power of PERM. 
This depends mainly on the fact that the above logarithmic corrections are weak, 
and thus a simple random walk needs very little pruning and enrichment. To a very 
good approximation, the chain length n performs during the simulation a random 
walk. Each addition of a monomer is a forward step of this walk, each pruning 
event is a jump backward to the next value of n on the stack. The efficiency of the 
algorithm is directly measured by the diffusion constant D of this walk, defined by 

((n 2 - m) 2 ) = 2Dt, (9) 

where t is the number of steps (=subroutine function calls in the recursive imple- 
mentation of Rem) performed between lengths m and ri2- The larger D, the faster 
a long chain is built up and disassembled again. Measurements gave D ss 2000 
right at Tj-^-Tg. This is to be compared to D — 0(1) for other chain growth 
algorithmsc3'E3 

Results of simulations with N = 10, 000 are shown in Figs. 1 and 2. Both 
demonstrate clearly that the dominant behavior is mean field like, but the detailed 
predictions of Eqs.(||) and (Q) are not verified. Indeed, the corrections are much 
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larger than predicted. It is not clear what is the source of these problems. It could 
be that higher order logarithmic corrections are important. But it could also be that 
multi-body forces between more than 3 monomers are important. Asymptotically, 
for N — > co, one expects three-body forces to be the only relevant ones (and this 
underlies Eqs.(^|) and but at finite N this might not be true. In any case, any 
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Figure 1: End-to-end swelling factor R%/N against 1/fn.iV, compared to previouS MC 
estimates (O) and to leading logarithmic prediction (dotted line). Curves are for e 1//feT = 
1.300, 1.305, 1.310, 1.315 (top to bottom). 
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Figure 2: Variance of contact energies (~ specific heat) per monomer, for the same tem- 
peratures as in Fig. 1 (q = e 1/kT ). Theory predicts C/N ~ (log N) 3/11 . 
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4x10° 6x10° 
chain length N 

Figure 3: Logarithms of partition sums (negative free energies) of single chains in a box 
of size 256 x 256 x 256. Notice that these are total free energies, not free energies per 
monomer. Fugacities are adjusted such that both maxima have the same height. The 
locations of the right-hand maxima give then estimates of $00 (T) . 
Results of this procedure are shown in Fig. 4. 
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Figure 4: Plus signs (+): log-log plot of the infinite- N monomer density <p against Tg —T. 
Crosses (x): log- log plot of critical monomer densities at finite N, plotted against Tg — 
T C (N), where T C (N) is the TV— dependent critical temperature for unmixing. 
Dashed line: theoretical prediction, up to arbitrary factor. 

attempQ to interpret present experimental data by means of Eqs.(||) and (Q) is 
likely to give wrong conclusions. 

In order to test prediction (iii) directly, one would have to simulate extremely 
long chains (N 3> 10 6 ), since otherwise results are strongly influenced by the surface 
of the globule. As an alternative, one can simulate systems without any surface by 
squeezing a chain into a cube with periodic boundary conditions. Above the 0- 
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point, the free energy will be a (cap-)convex function of N, while it should be 
non-convex for T < Tg. From this it should be possible to estimate both Tg and 
the density 4>. We were able to simulate chains of length up to 1.6 x 10 6 , in lattices 
of up to 2 25 sites. Some typical results are shown in Fig. 3. They show that Tg 
can indeed be estimated reliably, with results agreeing perfectly with those from 
Figs. 1 and 2. Values of <p obtained from Fig. 3 and similar plots are shown in 
Fig. 4, together with critical densities of multichain systems discussed in the next 
section. The dotted line in Fig. 4 represents the prediction (Js[) . We see very good 
agreement, much better than expected after having seen Figs. 1 and 2. 



3 Critical Unmixing 



Far below the collapse temperature, it is intuitively obvious that not only different 
parts of one long chain will coalesce, but also different chains will lump together and 
precipitate. For finite but large TV one expects therefore an unmixing transition at 
a critical temperature T C {N) which approaches Tg from below when TV — > oo. For 
any fixed TV this will be a critical point in the Ising universality class, but additional 
scaling laws are expected as TV — > oo. A schematic phase diagram is shown in Fig. 5. 



T C (N) 




Figure 5: Schematic phase diagram for semi-dilute solutions of chain polymers. Upper- 
most curve: monomer concentration inside an infinitely large collapsed globule under zero 
outside pressure. Lower curves: coexistence curves for fixed chain length TV. Curves 
are strictly monotonically ordered, with the coexistence curve for TV below that for TV' if 
TV < TV'. 

For large TV one expects critical densities cj) c (N), deviations of T C (N) from Tg, 
and critical amplitudes — <f)^ to scale with TV, 



(2) 



(1) ~ (T c - TfN- 

4r. ~ N~ X2 , 



(10) 

(11) 
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T e - T C {N) ~ N- X3 . (12) 

Flory-Huggins theorjES predicts x\ = 1/4, X2 = £3 = 1/2. Notice that these predic- 
tions stand on rather different footings. Exponents X2 and £3 actually describe only 
properties which are not influenced by critical Ising-type fluctuations, and should 
therefore be correctly predicted by Flory-Huggins theory. In contrast, one has to 
expect that x\ is heavily influenced by anomalous Ising scaling, since it appears in 
conjunction with fi which is different from its mean-field value in d = 3. Indeed we 
had problems in measuring x\ reliablyo Therefore we shall discuss in the following 
only X2 and X3, and the chain swelling R 2 N /N. Naively one should expect that 
chains are ideal a± .the, .critical point. 

Experimentl^'EiKSHoS suggest very strongly that £3 is smaller than its 
mean-field value, x$ — 0.38 ± 0.01. This prohljnm has lead to numerous specula- 
tionsE3'E3t3 In particular, it was suggest ed Ip O'Eia that chains are already partly 
collapsed at T C (N). For a review, see Refo. 

Simulations with PERM are straightforward. We just have to place every iV-th 
monomer not next to the previous one but at a random free site. In order to get the 
correct statistics (different chains are identical) we have to divide the partition sum 
by a factor X/M\ for a system with M chains. We simulated chain lengths between 
N = 8 and N — 4096. The latter is much larger than in any other simulation (the 
longest being N = 1000 in RefEl). 

Except for N = 4096, lattice sizes were such that the critical systems had 
roughly 50 to 100 chains. Although this is at least as large as in most previous 
simulations, it is not enough to be free from very large finite size corrections. Ih. 
order to cope with them, we used a double histogram method as suggested in RefuHl 
and applied to the present problem i^RcfHcJ. The main advantage of this method 
is that it allows to use field mixingp to get rid of the very strong asymmetry of 
the histogram and to reduce the problem essentially to the well understood Ising 
problem where the symmetry m ~m allows to pin down v£ry_precisely the critical 
point. But the efficiency of field mixing was checked in Refc3cil only in so far as it 
rendered the distribution of chain numbers M symmetric. The stronger requirement 
that also the_jniat (M, ^-distribution (_E=energy) should be symmetric was not 
tested in RefO'cil due to lack of sufficient statistics. In our simulations^! we found 
that this symmetry is indeed badly violated. We have no explanation for this, and 
no remedy. It is mostly this problem which prevented us from obtaining reliable 
estimates of Xi, see the discussion in Refo. But other findings are much less affected 
by it. 

Our first result is that chains arc not collapsed at the critical point but slightly 
swollen. This agrees with Refc3. The swelling is only weakly dependent on the 
monomer concentration (f>. Results obtained exactly at the critical point are shown 
in Fig. 6. They show that the swelling agrees in the limit N — > 00 perfectly with 
that of isolated chains. 

Log-log plots of <j) c (N) versus T g -T C (N), of (Tg -T C (N))/T C (N) versus N, and 
of <fic(N) versus N are shown in Figs. 4, 7, and 8. From Fig. 7 we see clearly that 
X3 « 1/2, with surprisingly small logarithmic corrections. This might be due to our 
choice of scaling variable. Using (Tg - T C (N))/Tg instead of (Tg - T C (N))/T C (N), 
though being equivalent for large N, would give much worse scaling. Nevertheless, 
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Fig. 7 suggests strongly that X3 is equal to its mean field value. This is in marked 
contrast to X2 ■ 
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Figure 6: Swelling factors R%/N at the critical point plotted against N. The dashed line 
is a fit with the function 1.8 — 0.9s -0 ' 36 . This fit has no particular significance except for 
the fact that the limit TV — ► 00 agrees with the swelling of infinitely long free O polymers. 
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Figure 7: Logdog plot of (Te - T C (N))/T C (N) against N (+). The dashed line has slope 
-0.51. 

At first glance, J^ig. 8 seems to give 22 ^^^^-hl-pcrfect agreement with 
previous simulationoo and with experiment EjuHoEjO But we see already in 
Fig. 8 that there are strong deviations at small values of TV, suggesting that this is 
strongly affected by logarithmic corrections. This suspicion is confirmed by Fig. 4 
and by a strong theore tical arguments: if X3 = 1/2 (as confirmed by our simulations 
and by previous workoo) , and if coexistence curves are monotonically ordered as 
indicated in Fig. 5 (and as assumed by all previous authors), then X2 is related to 
the dependence of <f>{T) for isolated infinitely long chains: <j){T) ~ T z with z < 2x2- 



Thus X2 < 1/2 would be inconsistent with the generally accepted value z = 1 
which we had also been confirmed by our simulations presented in Sec. 2. We thus 
conclude, in striking contrast to most previous authors, that also Xi has its mean 
field value, and that all observed deviations are (expected!) logarithmic corrections. 
Further arguments in favor of this, interpretation, based on an explicit evaluation of 
the free energy, are given in RefEl 



0.1 



10 



100 



1000 



Figure 8: Log-log plot of (f> c (N) against N. The dashed line (slope —0.385) fits data for 
large N, but deviations at small N are substantial. 



4 Further Developments 

4-1 Markovian Anticipation and 2- Dimensional SAW's 

As we had pointed out already in the introduction, PERM becomes most efficient if 
the a priori probabilities p(C) are such that exactly the right distribution is obtained 
even without pruning and enrichment. In SAW's, this means that we should not 
choose randomly among the free neighbors when placing the next monomer. One 
way to "guide" the growth consists in looking ahead. In the 'scanning method' of 
MeirovitclO one looks k steps ahead by checking all possible fc-step extensions, and 
decides on the success of these extensions which single step to take next. This is 
efficient, but also very time consuming: the effort increases exponentially with k. 

An alternative which costs hardly anything in terms of CPU time (but which 
requires rather large memory, if pushed to the extreme) consists essentially in look- 
ing back k steps. Assume we are dealing with a lattice with coordination number 
Af. During the initial steps of the simulation we build up two histograms H (i) and 
Hi{i) of size J\f( k+1 '> each. Here i = 0, . . .AA fe+1 ) — 1 labels all possible recent k- 
step histories and their one-step extensions. The first histogram Hq{i) contains the 
weights of these (fc + l)-step paths at the moment when the last step is made. The 
second histogram H±(i) contains the weights with which the same paths contribute 
much later (we used 100 steps in our simulations) to the partition sum. The ratio 
Hi(i) / Ho(i) is therefore a direct estimate of how much the last step in path i con- 
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tributed to the final partition sum. Let us denote this last step as j (= 0, . . . A/" — 1) 
and the previous k steps as iq, so that i = (io,j)- Then we propose 



PU M = ^ „ ,. — .,,,„ ,. — tjz (13) 

as the best anticipation where to go next, based on a memory of fc steps. 

Equation gives indeed a substantial improvement for 2-d SAW's on all tested 
lattices (square, triangular, honeycomb, Manhatten) . Using k such that J\f^ k+1 ^ w 
10 6 we get in all cases an increase by roughly a factor 10 in the effective diffusion 
constant D (see Eq.(||)). This means that the number of independent chains is 
increased by roughly one order of magnitude over the uniform choice. 

Applications and details will be published clsewhcreE-3 Here we just mention 
that this allows e.g. very efficient simulations in confined geometries. To illustrate 
this, we show in Fig. 9 a single chain of length N = 60, 000 in a slit of width 160. For 
convenience of plotting this was cut into 6 pieces. Such simulations allow significant 
tests of predictions about the monomer density in the vicinity of a hard wallEil 
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Figure 9: A single chain of length iV = 60, 000 in a strip of width 160 of a square lattice. 
In order to plot the configuration, it is cut in 6 pieces. 

4-2 Lattice Animals 

In our last example we want to illustrate that the strategy of PERM - starting off 
with a biased selection, weighting it properly, and using these weights to redirect 
the selection by cloning and killing - can also be implemented quite differently. 
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Consider the set of all connected clusters C n of n sites on a regular lattice, 
with the origin being one of these sites, and with a weight Q(C„) defined on each 
cluster. The (n-site) lattice animal problem is defined by giving the same weight to 
each cluster. The last requirement distinguishes animal statistics from statistics of 
percolation clusters. Take site percolation for definiteness, with 'wetting' probability 
p. Then a cluster of n sites with b boundary sites carries a weight p n (l — p) b in 
the percolation ensemble, while its weight in the animals ensemble is independent 
of b. In the limit p — > this difference disappears obviously, and the two statistics 
coincide. Due to universality, we expect indeed that the scaling behavior is the 
same for any value of p less than the critical percolation threshold p c . 

While there exists no simple and efficient algorithm for simulating large animals 
(for a recent Rosenbluth type algorithm see Refca), there exist very simple and 
efficient algorithms for percolation clusters. The best known is presumably the 
Leath algorithm^ which constructs the cluster in a "breadth first" (i.e., -layer by 
layer) way, the easiest to implement is a recursive "depth first" algorithms 

Our PERM strategy consists now in starting off to generate subcritical perco- 
lation clusters by the Leath method, and in making clones of those growing clusters 
which contribute more than average to the animal cnsembleua Since we work at 
p < Pc each cluster growth would stop sooner or later if there were no enrichment. 
Therefore we do not need explicit pruning. The threshold W + for cloning is chosen 
such that it depends both on the present animal weight and on the anticipated 
weight at the end of growth. 



Fig. 10: A typical lattice animal with 8000 sites on the square lattice. 

Usually, with growth algorithms like the Leath method, cluster statistics is up- 
dated only after clusters have stopped growing. But, as outlined below, one can also 
include contributions of still growing clusters. For percolation, this reduces slightly 
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the statistical fluctuations of the cluster size distributions, but the improvement is 
small. On the other hand, this improved strategy is crucial when using PERM to 
estimate animal statistics. 

Consider a growing cluster during Leath growth. It contains n wetted sites, b 
boundary sites which are already known to be non-wetted, and g boundary sites at 
which the cluster can still grow since their status has not yet been decided ( "growth 
sites"). This cluster will contribute to the percolation ensemble only if growth 
actually stops at all growth sites, i.e. with weight (1— p) 9 . Since the relative weights 
of the percolation and animal ensembles differ by a factor (1 — p)( b +s) (since now 
b+g is the total number of boundary sites), this cluster has weight W(C) oc (1 — p)~ b 
in the animal ensemble. If we would use only this weight as a guide for cloning, 
we would clone if W(C) is larger than some W + which is independent of b and g, 
and which depends on n in such a way that the sample size becomes independent of 
n. But clusters with many growth sites will of course have a bigger chance to keep 
growing and will contribute more to the precious statistics of very large clusters. 
It is not a priori clear what is the optimal choice for W + in view of this, but 
numerically we obtained best results for W + oc (1 — p) 9 . 

In this way we were able to obtain good statistics for animals of several thousand 
sites, independent of the dimension of the lattice. A typical 2-d animal with 8000 
sites is shown in Fig. 10. We were also able to simulate animal collapse (when each 
nearest neighbor pair contributes — e to the energy), and animals near an adsorbing 
surface. Details will be published in Refua 

5 Discussion 

We have presented a general strategy (PERM) for sampling from any given proba- 
bility distribution. The main idea is to follow initially a biased distribution which is 
more easy to simulate. This bias is taken into account by re-weighting the sample, 
and these weights are used to interfere by pruning and cloning. For this to be possi- 
ble it is needed that construction of instances is done in many small steps (which is 
always the case in statistical physics) , and that the initial growth of weights is not 
misleading. It is mainly the second requirement which limits the usefulness of our 
approach. In the 3-d Ising model, for instance, our approach gave rather mediocre 
results. But in other cases it is much more efficient. For polymers near the O 
collapse point, in particular, it seems by far the most efficient method presently 
known. 

The method shows some similarity to genetic and other evolutionary algorithms 
since good ('fit') instances are multiplied, while bad ('unfit') ones are killed. But 
in contrast to them this is done such that the wanted probability distribution is 
respected exactly, and it is done without the need for keeping large populations of 
instances in computer memory. We believe that it is mainly these features which 
make our strategy a promising candidate for further applications—Several such 
applications to toy protein models are discussed in these proceedingsEHfEirEj Further 
success will depend strongly on whether good non-trivial choices for p(C) (such as in 
markovian anticipation, Sec. 3a) or for W ± (such as in Sec. 3b) can be found in the 
specific problem at hand. Another promising avenue to follow is the combination 
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of our strategy with more conventional (Metropolis type) concepts. If this merge 
succeeds, PERM might lead to an efficient algorithms for finding native states of 
real proteins. 

The authors are most grateful to Gerard Barkema for numerous discussions. 
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